Learning to lemmatise Polish noun phrases
Abstract
We present a novel approach to noun phrase lemmatisation in which the main phase is cast as a tagging problem. The idea draws on the observation that the lemmatisation of almost all Polish noun phrases can be decomposed into transformations of the individual words (tokens) that make up each phrase. Our evaluation shows results similar to those obtained earlier by a rule-based system, while our approach allows chunking to be separated from lemmatisation.
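To illustrate the decomposition described in the abstract, here is a minimal sketch in Python. It is not the authors' implementation: the tag names (`keep`, `nom_sg`), the lookup table, and the function names are all hypothetical, and the tags are assigned by hand, whereas in the approach described they would be predicted by a tagger.

```python
# Hypothetical per-token transformation tags (illustrative names only):
#   "keep"   - leave the token's surface form unchanged
#   "nom_sg" - replace the token with its nominative singular form
# This toy lookup table stands in for a real morphological generator.
TOY_NOMINATIVE_SINGULAR = {
    "Ministerstwa": "Ministerstwo",
    "czerwonego": "czerwony",
    "samochodu": "samochód",
}


def apply_transformation(token, tag):
    """Apply one transformation tag to one token."""
    if tag == "keep":
        return token
    if tag == "nom_sg":
        # Fall back to the surface form when the toy table has no entry.
        return TOY_NOMINATIVE_SINGULAR.get(token, token)
    raise ValueError(f"unknown transformation tag: {tag}")


def lemmatise_phrase(tokens, tags):
    """Lemmatise a phrase token by token, given one tag per token."""
    if len(tokens) != len(tags):
        raise ValueError("each token needs exactly one transformation tag")
    return " ".join(apply_transformation(t, g) for t, g in zip(tokens, tags))


# The head noun is moved to the nominative; the genitive modifier is kept.
print(lemmatise_phrase(["Ministerstwa", "Finansów"], ["nom_sg", "keep"]))
# Both the adjective and the noun are moved to the nominative.
print(lemmatise_phrase(["czerwonego", "samochodu"], ["nom_sg", "nom_sg"]))
```

Because each token is handled independently once its tag is known, the phrase boundaries can come from any chunker, which is the separation of chunking from lemmatisation that the abstract mentions.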
Similar resources
Use of Articles in Learning English as a Foreign Language: A Study of Iranian English Undergraduates
The significance of error analysis for the learner, the teacher and the researcher is now widely recognized. Earlier studies of error analysis concentrated on intersystematic comparison of the “native language” and the “target language” and drew the required data largely from intuitions and impressionistic observations. This study was conducted on the basis of the following observations: (1) to...
Lemmatization of Multi-word Common Noun Phrases and Named Entities in Polish
In the paper we present PoLem, a tool for the lemmatization of multi-word common noun phrases and named entities in Polish. The tool is based on a set of manually crafted rules and heuristics utilizing a set of dictionaries (including morphological, named entities and inflection patterns). The accuracy of lemmatization obtained by the tool reached 97.99% on a dataset with multi-word common ...
The Puzzle of Case Agreement between Numeral Phrases and Predicative Adjectives in Polish
This paper addresses the optionality of case agreement between a numeral phrase in the subject position and its modifying or predicating adjectives in Polish: such adjectives either agree with the numeral or – apparently – reach into the numeral phrase and agree with the noun phrase within it. While previous analyses of this phenomenon postulated special agreement mechanisms, we account for the...
Towards the Lemmatisation of Polish Nominal Syntactic Groups Using a Shallow Grammar
While morphological analysers and taggers usually assign lemmata to wordforms, those tools focus on single words. For some tasks a tool that lemmatises (and thus normalises) whole phrases would be more appropriate. The paper presents, discusses and evaluates a set of tools to lemmatise nominal groups, based on a shallow grammar for Polish. The tools reach an overall success rate of over 58%, an...
Some Aspects of Semantic Representation of Polish Determiners (Wybrane aspekty reprezentacji semantycznej określników języka polskiego)
The paper concerns some methods of semantic analysis of Polish determiners which can be used in Machine Translation. First, a brief summary of traditional approaches to the semantics of Polish noun phrases is presented together with a short discussion. The difference between reference and quantification is argued to be important for proper understanding of some phenomena. Next, a unified model ...